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Abstract: Most of deterministic solvers for rarefied gas dynamics use discrete velocity 

(or discrete ordinate) approximations of the distribution function on a Cartesian grid. This 

grid must be sufficiently large and fine to describe the distribution functions at every space 

position in the computational domain. For 3-dimensional hypersonic flows, like in re-entry 

1-^ problems, this induces much too dense velocity grids that cannot be practically used, for 

c^ memory storage requirements. In this article, we present an approach to generate automat- 

G ically a locally reflned velocity grid adapted to a given simulation. This grid contains much 

less points than a standard Cartesian grid and allows us to make realistic 3-dimensional 

^ simulations at a reduced cost, with a comparable accuracy. 
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,-H Keywords: rarefled flow simulation, hypersonic flows, re-entry problem, transition regime, 

^ BGK model, discrete velocity model 

^ 1 Introduction 

j> The description of a flow surrounding a flying object at hypersonic speed in the rarefled 

'k^ atmosphere (more than 60 km altitude) is still a challenge in the atmospheric Re-Entry 

^ community [2] . When this flow is in a rarefled state, that is to say when the Knudsen number 

(which is the ratio Kn = j^ between the mean free path A of particle and a characteristic 
macroscopic length L) is larger than 0.01, the flow cannot be accurately described by the 
compressible Navier-Stokes equations of Gas Dynamics. In this case, the kinetic theory has 
to be used. The evolution of the molecules of the gas is then described by a mass density 
distribution in phase space, which is a solution of the Boltzmann equation. In the transitional 
regime, this equation can be replaced by the simpler Bhatnagar-Gross-Krook (BGK) model. 
In order to be able to compute parietal heat flux and aerodynamic coefficients in the 
range of 60-120 km, a kinetic description of the stationary flow is necessary. 

The most popular numerical method to simulate rarefled flows is the Direct Simulation 
Monte Carlo method (DSMC) |9j. However, it is well known that this method is very 



expensive in transitional regimes, in particular for flows in the range of altitude we are 
interested in. The efficiency of DSMC can be improved by using coupling strategies (see [121 
13]) or implicit schemes (see ^241 US]), but these methods are still not very well suited 
for stationary computations. In contrast, deterministic methods (based on a numerical 
discretization of the stationary kinetic model) can be more efficient in transitional regimes. 
Up to our knowledge, there are few deterministic simulation codes specifically designed for 
steady flows. One of the most advanced ones is the 3D code of Titarev [25] developed for 
unstructured meshes. Another 3D code has been developed by G. Brook [TU]. Other codes 
exist, but they are rather designed for unsteady problems, see for instance [201 [I] or the recent 
UGKS scheme developed by K. Xu and his collaborators [271 tlH], which is an Asymptotic 
Preserving scheme for unsteady flows. 

In our team, we developed several years ago a code to make 2D plane and axisymmetric 
simulations of rarefled flows based on the BGK model (see a description in [221 [221 H])- This 
code has recently been extended to 3D computations, for polyatomic gases. Due to the 
physical model (polyatomic gases), the space discretization (block structured mesh), and 
the parallelization (space domain decomposition with MPI and inner parallelization with 
openMP), this code is rather different from the other existing 3D codes recently presented in 
the literature for the same kind of problems (the 3D code of Titarev [25] for example), even 
if space domain decompositions have already been used for unsteady simulation (see [2U]). 

All the codes designed for steady flows have a common feature: they are based on a 
"discrete ordinate" like approach, and use a global velocity grid. This grid is generally 
a Cartesian grid with a constant step size. The number of points of this grid is roughly 
proportional to the Mach number of the flow in each direction, and hence can be prohibitively 
large for hypersonic flows, even with parallel computers. To compute realistic cases (3D 
conflgurations with Mach number larger than 20), the velocity space discretization has to be 
modifled in order to decrease CPU time and memory storage requirements. It has already 
been noticed that a reflnement of the grid around small velocities can improve the accuracy 
and reduce the cost of the computation (especially for large Knudsen numbers in flows 
close to solid boundaries, see [2S])- However, up to our knowledge, there is no general 
strategy in the literature that helps us to reduce the number of discrete velocities of a 
velocity grid for any rarefled steady flow, even if some works on adaptive velocity grids have 
already been presented: the flrst attempt seems to be [5] for a ID shock wave problem, 
and recently, more general adaptive grid techniques designed for unsteady flows have been 
presented in [211 HI [IHl HI] • 

The main contribution of this article is to propose an algorithm for an automatic con- 
struction of a locally reflned velocity grid that allows a dramatic reduction of the number of 
discrete velocities, with the same accuracy as a standard Cartesian grid. This algorithm uses 
a compressible Navier-Stokes pre- simulation to obtain a rough estimation of the macroscopic 
flelds of the flow. These flelds are used to reflne the grid wherever it is necessary by using 
some indicators of the width of the distribution functions in all the computational domain. 
This strategy allows us to simulate hypersonic flows that can hardly be simulated by stan- 
dard methods, since we are indeed able to apply our method to our kinetic code to simulate 



a re-entry flow at Mach 25 and for temperature and pressure conditions of an altitude of 90 
km. In this example, the CPU time and memory storage can be decreased up to a factor 24, 
as compared to a method with a standard Cartesian velocity grid. Note that preliminary 
results have already been presented in P and [^. 

The outline of this article is the following. In section [2| we briefly present the kinetic 
description of a rarefied gas. In section |3| we discuss the problems induced by the use of a 
global velocity grid, and our algorithm is presented. Our approach is illustrated in section |4] 
with several numerical tests. To simplify the reading of the paper, the presentation of our 
simulation code is made in the appendix. 

2 Boltzmann equation and Cartesian velocity grid 

2.1 Kinetic description of rarefied gases 

In kinetic theory, a monoatomic gas is described by the distribution function f{t, x, v) defined 
such that f{t, X, \)d:xid\ is the mass of molecules that at time t are located in an elementary 
space volume dx centered in x = (x, y, z) and have a velocity in an elementary volume dv 
centered in v = {vx,Vy,Vz). 

Consequently, the macroscopic quantities as mass density p, momentum pu and total 
energy E are defined as the five first moments of / with respect to the velocity variable, 
namely: 

(p(t,x),pu(t,x),E(t,x))= / (l,v,^|vn/(t,x,v)dv. (1) 

Jm.3 I 

The temperature T of the gas is defined by the relation E = |p|up + ^pRT, where R 
is the gas constant defined as the ratio between the Boltzmann constant and the molecular 
mass of the gas. 

When the gas is in a thermodynamical equilibrium state, it is well known that the 
distribution function / is a Gaussian function M[p, u, T] of v, called Maxwellian distribution, 
that depends only on the macroscopic quantities : 

It can easily be checked that M satisfies relations ([I]). 

If the gas is not in a thermodynamical equilibrium state, its evolution is described by the 
following kinetic equation 

5*/ + v-Vx/ = g(/), (3) 

which means that the total variation of / (described by the left-hand side) is due to the 
collisions between molecules (described by the right-hand side). The most realistic collision 
model is the Boltzmann operator but it is still very computationally expensive to use. In 
this paper, we use the simpler BGK model [H [26] 

g(/) = -(M[p,u,T]-/) (4) 



which models the effect of the coUisions as a relaxation of / towards the local equilibrium 
corresponding to the macroscopic quantities defined by ([I]). The relaxation time is defined 
as r = -^, where /z is the viscosity of the gas. This definition allows to match the correct 
viscosity in the Navier-Stokes equations obtained by the Chapman-Enskog expansion. This 
viscosity jj, is usually supposed to fit the law jj, = fj.refij?—)'^, where f^ref and T^e/ are 
reference viscosity and temperature determined experimentally for each gas, as well as the 
exponent u (see a table in [5]). 

The interactions of the gas with solid boundaries are described with some standard 
reflection models, like diffuse, specular or Maxwell reflection models. Let us suppose that 
the boundary has a velocity u^ and temperature T^. In the diffuse reflection model, a 
molecule that collides with this boundary is supposed to be re-emitted with a temperature 
equal to Ty^, and with a random velocity normally distributed around u^„. This reads 

/(t,x,v) = M[cr^„,u^,T^](v) (5) 

if V ■ n(x) > 0, where n(x) is the normal to the wall at point x directed into the gas (if 
V ■ n(x) < 0, we assume / = 0). The parameter 0"^, is defined so that there is no normal 
mass flux across the boundary (all the molecules are re-emitted). Namely, that is, 

a^ = -([ /(t,x,v)vn(x)dv)/( / M[l, u^,T^](v)v • n(x) civ ) . (6) 

\Jvn(x)<0 / V~'v-n(x)>0 / 

In the specular reflection model, a molecule that collides with the boundary is reflected with 
the same tangential velocity component, while its normal velocity component is reversed. 
That means there is no energy or momentum exchange at the wall, that is to say no heat 
flux and no shear stress. This reads 

/(t,x,v) = /(t,x,v-2(vn(x))n(x)) 

However, as those two models are extreme cases, it can be more relevant to use the Maxwell 
reflection model 

/(t,x,v) = (1 -a)/(t,x,v-2(v-n(x))n(x)) + aM[cr^,u^,T^](v), 

which is a convex combination between the diffuse and specular reflection models. 

2.2 Velocity discretization on Cartesian grid 

In most of existing computational codes for the Boltzmann equation that use a deterministic 
method, the velocity variable is discretized with a global Cartesian grid, that is to say the 
same grid for every point of the space mesh. Consequently, it is necessary to compute a priori 
a velocity grid in which every distribution function that may appear in the computation can 
be resolved (see figure [I]). This means that the grid must be: 



• large enough to contain the main part of the distribution functions for every position 
in the computational domain 

• fine enough to capture the core of the distribution functions for every position in the 
computational domain 

The corresponding parameters (bounds and step of the grid) can be determined with the 
following idea: at each point x of the computational domain, the macroscopic velocity u(x) 
and temperature T(x) give information on the support of the distribution function /(x, .). 
Indeed, if /(x, .) is "not too far" from its corresponding Maxwellian, as supposed in the BGK 
model, its support is centered around u(x), and is essentially localized between bounds that 
depend on u(x) and T(x), corresponding to the standard deviation of the Maxwellian ; that 
is to say /(x, .) is very small outside (see figured where these bounds are u(x) — ca/-RT(x) 
and u(x) + c^^/RT{x) in ID). When several distribution functions have to be discretized on 
the same grid, their supports are reasonably well approximated if the bounds are 

v:,,, = max{«"(x) + c/RTM} C = min{n"(x) - c^RT{^)}, (7) 

and if the grid step is 

Av = amm^/RT{x), (8) 

xgSl 

where il is the computational domain (in space), and a = x,y,z (see |3]). In ([7])-(|8]), the 
parameters c and a can be chosen as follows. For c, statistical argument suggest that values 
between 3 and 4 are needed, and so c = 3 seems to be a good compromise between accuracy 
and computational cost. The parameter a allows us to adjust the grid step: it must be at 
most equal to 1 (which ensures that there are at least six points inside the core of every 
distribution function), but a smaller value might be necessary for an accurate computation. 
To compute these bounds, it is necessary to a priori estimate the values of u and T in 
the computational domain. A simple way is to use a compressible Navier-Stokes (CNS) pre- 
simulation (for which the CPU time is negligible as compared to the Boltzmann simulation) . 
This simulation gives the fields u"-^^"^ and T^^^ in Q, and the bounds and the grid step can 
be defined by formulas ([7])-(|8]) applied to these fields, that is to say 



v'^a. = max{«"'^^^(x) + cv/i?T^^^(x)} t;1„ = min{«°'^^^(x) - c^/RT^^^^)}, (9) 

and 

Av = a min ^/?T^^^(x). (10) 

xef2 

However, the values of the bounds are mainly determined by the temperatures in the 
shock zone, which can reach thousands of Kelvin. One must be careful that, at high altitude 
and high speeds, numerical computations of Navier-Stokes equations can underestimate those 
temperatures, which would lead to inappropriate bounds. 

It is quite clear that the size of the velocity grid increases with the Mach number. Indeed, 
a large Mach number implies large upstream velocities and large temperature in the shock 



wave, which lead to very large bounds (see (7])), while the temperature of the body remains 
small, thus leading to small grid step (see (IS)). For realistic re-entry problems, this can lead 
to prohibitively large grids. For instance, for the flow around a cylinder of radius 0.1 m at 
Mach 20 and altitude 90 km, formulas (|9| and ( 10 ) used with a Compressible Navier-Stokes 



pre-simulation lead to a velocity grid of 52 x 41 x 41 points and around 350 GB memory 
requirements with a coarse 3D mesh in space. 

However, this is mainly due to the use of a Cartesian grid with a constant step. In order 
to decrease the size of the grid, it is attractive to refine it only wherever it is necessary, and to 
coarsen it elsewhere. In the following section, we show that this can be done automatically 
with our new algorithm. 

This approach may not be well designed for cases where / is very far from it correspond- 
ing Maxwellian (like in shock interactions problem). However, for reentry problems, our 
assumption is realistic for transitional regime we are interested in. 

3 An algorithm to define locally refined discrete veloc- 
ity grids 

Since we have to represent many distribution functions on the same grid, it is natural to 
refine the grid in the cores of theses distributions, and to coarsen it in the tails (see figure |4]). 
To achieve this goal, we define the concept of "support function" , and we use it to design an 
AMR (Adaptive Mesh Refinement) velocity grid. 

3.1 The support function 

At each point x of the computational space domain Q, u(x) and T(x) give information on 



the "main" support of the distribution function 



X, .) in the velocity space: this support 



is centered at u(x) and of size 2cyi?T(x) (see ul))- We define the support function (p in 
the velocity space as follows: for every v in the velocity space, we set 0(v) = ^^/R^{x) if 
there exists in ^2 a x such that ||v — u(x)|| < c\/RT{k), that is to say that v is inside the 
support of a distribution, considered as a sphere of center u(x) and of radius ca/-RT(x). This 
function gives an incomplete mapping of the velocity domain (see figure [s]). Now we want 
to extend this function to the whole velocity space, so that it can be used as a refinement 
criterion to design an optimal velocity grid. 

Indeed, so far, this function is not defined for every v, especially for large ones, as we 
cannot find for every v an appropriate pair (u(x), r(x)) to match the definition of the support 
function. Moreover, this function might be multivalued, as there might be two different 
space points Xi and X2 with same velocity u(xi) = u(x2), but with different temperatures 
T(xi) 7^ T{'K2). Since our goal is to obtain for every v the minimum size of the support of 
every distribution centered around v, these two problems can be avoided as follows : 

(a) we set 0(v) = min(A/-RT(xi), a/-RT(x2)) if ||v — u(xi)|| < Ca/-RT(xi) and ||v — 



u(x2)|| < c^RT{-K2) with (u(xi),T(xi)) ^ (u(x2),T(x2)). Indeed, the size of the 
grid around v is constrained by the distribution centered on v that has the smallest 
support, hence must have the corresponding smallest possible value. 



(b) if there is no x such that ||v — u(x)|| < Ca/-RT(x), then v is in the tail of every 
distribution function, and there is no reason to refine the grid there, so we set 0(v) to 
its largest possible value 0(v) = max ■\/i?T(x). 

These ideas lead to the following algorithm for an automatic construction of 0. Note that 
below, the computational domain in space is supposed to be discretized with a mesh com- 
posed of cells numbered with three indices («, j, k). This is not a restriction of our algorithm. 



Algorithm 3.1 (Construction of the support function). 

1. CNS velocity and temperature are stored in arrays u(i, j, k) and T{i,j, k) for i,j, k = 

J- • '^max) ]max) '^max- 

2. construction of a (fine) Cartesian velocity grid (based on the previous global crite- 
rion ([9])- ([To])): points v(g) with g = : q^ax- 



3. computation of the field VRT, stored in an descending order in the ID array iIj{I), for 

1 = 1: imax X jmax X kmax, SO that ^(/) = ^J RT{{i, j,k){I)) . 

4. initialization of the array 0(0 : Qmax) = max(-0) on the velocity grid (one value per 
discrete velocity) 

5. do I = 1 : imax X jmax X kmax (loop ou the ccUs of the space mesh) 

do q = : qmax (loop on the nodes of the velocity grid) 

— if v(g) is in the sphere of center u((i, j, k){I)) and radius cip^I), then it is in 
the support of a distribution function, and we set 0(g) := ip{I) 

This algorithm ensures that the array satisfies the following property. 

Property 3.1. For every q between and qmax, we have : 

0(g) = mini min< \/RT{i,j, k) such that ||v(g) — u(i, j, A;)|| < c\/RT{i,j, k) L 

max \/RT{i,j^ k) 

This means that if we take a velocity v(g) of the fine grid, then among all the distribution 
functions whose support contains v(g), one of them has a smallest support, and 2c0(g) is the 
size of its support. This support function hence tells us how the fine Cartesian grid should 
be refined, or coarsened, around v(g). 

The generation of the corresponding adapted grid is described in the following section. 
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Remark 3.1. It may happen that with such a strategy, the support function does not take 
sufficiently into account the support of the wall Maxwellian (for diffuse reflection conditions 
on a solid obstacle): this Maxwellian is centered around u = and has a temperature 
Twaii which is often the smallest temperature of the computational domain. This can be 
the case when the macroscopic CNS flow is not resolved enough close to the wall, or if the 
CNS equations are solved with slip boundary conditions. In this case, it is safer to add the 
element (p^qmax + 1) := \/RTwaU in and to add the corresponding velocity ^{qmax + 1) := 
in the fine grid. 

3.2 AMR grid generation 

The idea is to start with a unique cell defined by the bounds of the fine velocity grid (that 
is the full velocity domain), and then to apply a recursive algorithm: each cell is cut if its 
dimensions are larger than the minimum of a0 in the cell. The algorithm is the following: 

Algorithm 3.2 (Recursive refinement of a cell C). 

1. compute the minimum of the field a0 in the cell C, that is to say the minimum of the 
elements of acf) that have the same indices as the discrete velocities of the fine grid 
included in this cell C: 

m := amin{{/)((3'), such that v(g) G cell C} 

2. if one edge of C is larger than m, then the cell is cut into 8 new subcells on which the 
refinement algorithm is applied, recursively. 

3. else, the cell is kept as it is, and the cell and its vertices are numbered. 

At the end of the algorithm, the final grid satisfies the following property: every cell has 
a size smaller than ax the minimum of the support function in the cell. 

3.3 An example 

We anticipate on the numerical results that will be discussed in section |4] to illustrate the 
previous algorithms. 

The test case is a steady flow around a cylinder at Mach 20. A CNS pre-simulation and 
the use of formula (|9|-(10) with parameters c = 4 and a = 2 give us a fine velocity grid 



with 52 X 41 points, see figure pi Algorithms 3.1 and 3.2 applied to the CNS fields give the 



support function and the AMR velocity grid shown in figure [7j The AMR velocity grid has 
529 points, which is one fourth as small as the original fine Cartesian grid. 

Note that the AMR grid is refined around the zero velocity and the upstream velocity: 
these velocities correspond to the fiow close to the solid boundary and to the upstream 
boundary, where the temperature is low, thus where the distribution functions are narrow, 
and hence where, indeed, we need a fine velocity resolution. At the contrary, the grid is 



coarse close to its boundaries: the discrete velocities are very large there and are all in the 
tails of all the distribution functions, where, indeed, we do not need a fine resolution. 
The accuracy of the computation on this AMR grid is studied in section |4} 

3.4 Discrete Velocity Approximation on the AMR velocity grid 

When one wants to transform a standard discrete velocity method based on a Cartesian grid 
to a method using our AMR grid, two points have to be treated carefully: the computation 
of the moments of /, and the approximation of the collision operator. In this paper, we only 
use the BGK collision model of the Boltzmann equation: therefore its approximation (based 
on the conservative approach of [23]) reduces to the problem of a correct approximation of 
the moments of /, as it is described below. 

First note that the standard discrete velocity approach consists in replacing the kinetic 
equation ^ by the following discrete kinetic equation 

5t/, + v,-Vx/, = g,(/), (11) 

where fq(t,x) is supposed to be an approximation of fit^x^Wq) for every discrete velocity 
Vg, with g = to qmax- The discrete distribution function is / = {fq{t,x)), and Qq is the 
discrete collision operator. The moments of the discrete distribution function are obtained 
by using a quadrature formula to replace ([I]) by 

Qmax -| 

(p(t, X), pu(t, X), E{t, X)) = ^(1, Vg, -|VgP)/q(t, X.)ujq, 

9=0 

where (uq) are the weights of the quadrature. 

For the BGK model (ll]), the conservative approximation of [23j gives the following discrete 
BGK collision operator 

Qqif) = kMq[p,U,T]-fq), 

where Mq[p,u,T] is the discrete Maxwellian that has the same discrete moments as /. It 
can be shown that there exists a G M^ such that Mq[p, u, T] = exp(a ■ m(vg)), that is to say, 
a is the unique solution of the nonlinear system 

Qmax 

^ m(vg) exp(a ■ m{-Vq))ujq = 

9=0 

where we note m(v) = (1, v, ^|vp)"^. This system can be solved by a Newton algorithm (see 
details in [23] and a more economic version of the algorithm due to Titarev in [2^). 

Consequently, to apply this approach to our AMR velocity grid, we just have to chose a 
quadrature formula. First, we propose the standard Qi bilinear interpolation of / on each 
cell of the AMR grid (by using the four corners of the cell). In this case, the quadrature 

9 




points are the nodes of the grid, and so are the discrete velocities that are used in the 
transport term. The quadrature weights are 

cell Ci 9Vq 

where d = 3 in 3D or 2 in 2D, and \Ci\ is the volume (or the area in 2D) of Ci. 

The number of discrete velocities can be decreased if we take a simpler Pq (constant per 
cell) interpolation. Here, the quadrature points are the centers of the cells, and so are the 
discrete velocities Vg that are used in the transport term. The weights are simply the volume 
(or the area) of the cells: Uq = \Cq\. Note that the number of discrete velocities that are used 
with this approach is equal to the number of cells of the AMR grid, which is smaller than 



the number of nodes, see section 42 We advocate the use of this latter approach, since it 
allows to save a lot of computer memory, and since we observed with our tests that the Qi 
interpolation does not give more accurate results. 

Remark 3.2. Our AMR velocity grids are often very coarse at their boundaries (since veloc- 
ities are very large there, and are in the tails of all the distribution functions). Consequently, 
passing from Qi to a Pq grid may lead to a grid which is not large enough. Indeed, if the Qi 
grid is of length L (in the x direction, for instance) and its outer cells are of length /, then 
the length of the Pq grid now is L — /. If / is large, the Pq grid is not large enough. In that 



case, it is thus safer to modify step 2 of Algorithm |3.1| by increasing the bounds of the fine 
Cartesian grid: f^^^ and f^j„ are replaced by t>^^^ + Av and f^j„ — Av for example, where 
Av is the step of the fine grid. 

3.5 AMR grid generation for axisymmetric flows 

Many interesting 3D flows have symmetry axis (like flows around axisymmetric bodies with 
no incidence). In that case, if we assume that the symmetry axis is aligned with the x 
coordinate, it is interesting to write the kinetic equation in the cylindrical coordinate system 
(x, r, ip), since the distribution function / is independent of (p, and we get 

dtf + vAf + C cosujdrf - ^^^^dj = g(/), (12) 

r 

where ( and u are the cylindrical coordinates of the velocity in the local frame (e,., e^), that 
is to say v ■ Cr = C cos u and v ■ e^ = (sinu. 

If the velocities of the upstream flow and of the solid boundaries have no component in 
the if direction, then / is even with respect to u. Indeed, using the Galilean invariance of 



equation (12), we apply the symmetry with respect to the (x, z) plane to the equation to get 
that f{t, X, r, Vxi C, — w) is also a solution. 

In this case, the discrete velocity grid must be constructed for variables {vx,(y'^): where 
( is non negative, u is in [0,7r], and Vx can take any value. The construction of this grid is 
slightly different from the Cartesian case. We briefly comment on these differences in this 
section. 
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First, bounds in Vx and C, directions can be easily obtained and we set 

Vx,max = maxJM^^^ + ca/RT}, Vx,min = minJM^^'^ - ca/RT}, 

(max = max{M^^^ + Ca/RT}, Cmin = 0, 

where Ux and Ur are the macroscopic velocities in the axial and radial directions. Since u is 
a bounded variable, there is nothing else to do at this level. 



The steps Avx and A( of the initial grid are obtained by (10) for Vx and (. However, 
the choice of the step in the cu direction is not obvious. Indeed, this variable has no link 
with the width of the distribution. Moreover, the grid in this direction must be fine enough 
to capture the derivative 9^/, while in the other directions, only some moments have to be 
captured. After several tests (and from the experience made in |22|), it seems that a grid 
step Au such that there are 30 points uniformly distributed in [0, vr] is a good compromise 
between cost and accuracy. 

To refine/coarsen this initial cylindrical grid, we choose to use the procedure described 



in the steps of sections 3.1 and 3.2 in the plane (vx, C) to generate a two-dimensional AMR 
grid. Indeed, for the same reasons as mentioned above, there is no reason to refine or coarsen 
the grid in the u direction. Then, the complete grid is obtained by a rotation of the two- 
dimensional AMR grid in the oj direction, with the same step Au as in the initial grid. This 
procedure is simple and turned out to be very efficient. 



Since the AMR grid is uniform in the u direction, the discretization of (12) is made with 
the approximations derived in j23j . 

4 Numerical results 

4.1 The 3D kinetic code 

Our 3D kinetic code is written in Fortran, and can deal with 2D planar flows, or 2D axi- 
symmetric flows as well as 3D flows on multiblock curvilinear structured grids. A steady 
state solution of the BGK equation for monoatomic or polyatomic gases is computed with a 
linearized implicit finite volume scheme, based on a velocity discretization which is conser- 
vative and entropic (see |s22j). In axisymmetric cases, a specific scheme (named T-UCE) is 
used in order to ensure a conservative and entropic discretization of the transport operator 
(see [23|). 

In order to deal with high computational costs (in 3D configurations for example), an 
hybrid parallel implementation is used: a domain decomposition in the position space used 
with the library routines specified by the Message Passing Interface (MPI) can deal with a 
large number of mesh cells. Note that this is the opposite of the strategy used in |25] (which 
uses a domain decomposition in the velocity space). Moreover, the OpenMP library is used 
with 8 threads for loops computations that are local w.r.t the position (computation of the 
moments, Maxwellian, etc.), thus large numbers of velocity points can be reached. We also 
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use OpenMP for loops that are local w.r.t the velocity, like the computation of the transport 
terms, for instance. The communication between different domains are treated as explicit 
boundary conditions. This code has been run on the super-computer TERA-100 of the CEA 
(that has around 140 000 cores Intel Xeon 7500), by using at most 1600 cores. 

Since a general description of our code is not the main subject of this paper, we do not 
give more details here. However, for completeness, note that interested readers can find a 
more detailed description of the code in appendix |Xj 

The distribution is initialized by using a CNS pre-simulation (also used for the AMR 
strategy): the number of iterations before reaching convergence is then considerably de- 



creased (see section 4.5). 

For the 2D plane examples shown below, we use a simpler version of the code, called 
CORBIS. This code is based on the same tools but makes use of the reduced distribution 
technique to reduce the velocity space to 2D only. This code only uses OpenMP directives 
to work on parallel computers (see |1]). It has been used on a SMP node of 48 cores AMD- 
Opteron-8439-SE. 

4.2 A 2D plane example 

Here, we illustrate our approach with the simulation of a steady flow over a cylinder of 
radius 0.1m at Mach 20, for density and pressure of the air at an altitude of 90 km. The 
gas considered here is argon (molecular mass= 6.663 x 10^'^^kg). Namely, we have p = 
3.17 X IQ-^kg/m^, u = 5.81 x lO^m/s, and T = 2A2AK. 

The space mesh uses 50 x 50 cells, with a refinement such that the first cell at the solid 
boundary is smaller than one fifth of the mean free path at the boundary. 

A CNS pre-simulation and the use of formula (p 10) with parameters c = 4 and a = 2 



give us a fine velocity grid with 52 x 41 points (bounds ±11500 and ±8 900 for Vx and w 



y 



with a step of 450, in m.s~^), see figure^ Algorithms 3.1 and 3.2 applied to the CNS fields 
give the support function and the AMR velocity grid shown in figure [7j The AMR velocity 
grid has 529 points, which is one fourth as small as the original fine Cartesian grid. This 
gain can be further increased if we define the discrete velocities as the centers of the cells of 
the AMR grid rather than its vertices. This gives 316 discrete velocities, hence with a gain 
of 6.7 times instead of 4. 

First, we compare the CPU time required by the code CORBIS with the different velocity 
grids. While the number of iterations to reach steady state is approximately the same with 
both grids, the CPU time required by the original fine Cartesian velocity grid is around 7 
times as large as with the new AMR grid, which is almost the same ratio as the ratio of the 
number of discrete velocities. The memory required with the uniform grid method is around 
170 MB whereas with the use of the AMR grid, only 25 MB of memory storage is required. 
This shows that the new AMR grid leads to a real gain both in memory storage and CPU 
time. 

Then, we compare the accuracy of the results with the two grids for the macroscopic 
quantities. We compute the normal component of the heat flux to the boundary, which 
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is a quantity of paramount importance in aerodynamic simulations: we find a maximum 
relative difference lower than 5%, which is reasonably small (see the profile of this flux on 
figure [s]). We also compute the differences for the density, temperature and pressure in the 
whole computational domain: 

• the mean quadratic relative difference over all the cells of the computational domain is 
5% for the density, 1% for the temperature, 0.6% for the horizontal velocity, and 1.2% 
for the vertical velocity; 

• the maximum relative difference on each cell of the computational domain is 40% for 
the density, 69% for the temperature, and 157% for the velocity. 

The maximum relative differences are observed at the solid wall for the density and the 
velocity, and in the upstream flow for the temperature. 

This difference is quite large, and can be explained as follows. The smallest cells of our 
AMR velocity grid (that are around small velocities, like velocities at the solid wall) turn out 
to be smaller than the cells of the Cartesian grid (size 330 instead of 450). This means that 
our results with the AMR grid are probably more accurate than the results of the Cartesian 
grid. Consequently, the Cartesian grid results should not be considered as a reference for 
this comparison. 

To confirm this analysis, we make a new simulation with a Cartesian grid with a uniform 
step of 330 (like the smallest step of the AMR grid). We observe that relative differences 
between the Cartesian grid and the AMR grid are much smaller: 

• the mean quadratic relative difference over all the cells of the computational domain 
is 2% for the density, 0.5% for the temperature, 0.3% for the horizontal velocity, and 
0.7% for the vertical velocity; 

• the maximum relative difference on each cell of the computational domain is 12% for 
the density, 13% for the temperature, 80% for the velocity. 

The maximum relative difference is still too large for the velocity (80% at the boundary), 
but this quantity is very small in this zone and probably requires smaller velocity cells (that 
is a smaller parameter a for both grids). However, this inaccuracy does not deteriorates the 
results on the other quantities, in particular for the heat flux at the boundary. Indeed, the 
comparison of the heat fluxes is even better, since we find a relative difference lower than 
2.5%, which is excellent. The comparison in terms of CPU time and memory storage is of 
course more favorable to the AMR grid here. 

Note that we also did the same computations with an finer AMR grid obtained with the 
parameter a = 1: the difference with a = 2 is less than 2% for the heat flux at the solid wall, 
which is quite good. The value a = 2 is then clearly sufficient. However, if one is interested 
with the flow velocity at the boundary, we advocate to use a = 1, since the differences here 
are around 15%. 
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4.3 A 2D axisymmetric example 

Here we consider the flow around a sphere of radius 0.1 m, at Mach 20, for density and 
pressure of the air at an altitude of 90 km. The gas is air (molecular mass= 4.81 x 10~^^ 
kg), considered as a diatomic gas. Namely, the upstream density and pressures are p = 
3.17 X 10~^kg/m^, and p = 0.16N.m~'^, and the temperature of the surface of the sphere is 
T = 280K. 

The space mesh uses 70 x 50 cells, with a refinement such that the first cell at the solid 
boundary is smaller than one fifth of the mean free path at the boundary. A CNS pre- 



simulation and the use of formula (13) with parameter c = 4 give us the following velocity 



bounds —8 700 and 11000 for Vx, and 7 500 for (, and the bounds for co are always and 



71, as explained in section 3.5 



All the following simulations were made with 140 domains and 4 OpenMP threads per 
domain. 



First, a fine uniform cylindrical velocity grid is obtained with (10) and a = 2: this gives 
65 X 33 X 31 = 66 495 discrete velocities. The steady state is reached after 1828 iterations, 
in 5 476 s. The corresponding temperature and velocity fields are shown in figure [9] 

Then a cylindrical AMR grid is generated (with a = 2): it gives 7691 discrete velocities 
(with the Qi points), the steady state is reached after 1899 iterations, in 710s. The support 



function and AMR grid are shown in figure 10 Here, the gain factors in memory and in 



CPU time with respect to the uniform grid are around 9. 

Now, we use the same AMR grid, but with the Pq points (that is to say the centers of the 
cells), which gives 6 900 discrete velocities only. The number of iterations is close (1796), 
and the CPU time is 562s. Here, the gain factors in memory and CPU time, with respect 
to the uniform grid, are around 10. 



Finally, we compare in figure [TT] the normal heat flux at the boundary for the uniform grid 
and the AMR grid generated by using the CNS fields (Qi version). As in the 2D example, 
we find that the fluxes are very close (with a relative difference which is less than 1%). We 
get the same results with the Pq version. 

Consequently, the AMR grid (Pq version) is the most efficient here: it gives the same 
results as the fine uniform grid, for computational and memory cost which is one ninth as 
small. 

Remark 4.1. Before discussing a full 3D case, we briefiy mention another experiment we 
have done here. We have tried to see if the use of a CNS pre-simulation could be avoided 
by using a direct estimation of the extreme values of the velocity and the temperature 
given by the Rankine-Hugoniot relations. Indeed, before the computation, we know the 
upstream values u„p and T„p, and we can compute the downstream values {udown,Tdown) = 
RH{uup, Tup) by the Rankine-Hugoniot relations for a stationary shock. Then we have three 
set of different values: upstream, downstream, and wall values (u^^// = 0,Tu,aii), and we can 



use these three sets in our algorithm 3.1 instead of the CNS fields (u'-^^'^,T*^^'^). 

With the same test case, this strategy gives a grid with 3 962 points only (with a = 2), 
which is half as less as with the CNS fields. The support function and AMR grid are shown 
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in figure 12 of course, since we only liave tliree values for the macroscopic fields here, the 
grid is not as smooth as with the CNS fields. 

Moreover, the convergence to steady state is much slower (since we cannot use the CNS 



solution as an initial state, see section 4.5 for a comment on the initialization). More im- 
portant, the accuracy observed on the heat flux is not as good as with the CNS fields: the 
difference with the fine uniform grid is here around 8% instead of 1%. 

Consequently, even if strategy is simple and does not require a CNS solver, it seems to 
be not accurate enough. 

4.4 A 3D example 

Here, we consider the flow around a 3D object composed of cone with a spherical nose, with 



no incidence. The physical parameters are the same as in section 4.3 



The space mesh uses 50 000 cells, with 50 in the direction normal to the surface. 



A CNS pre-simulation and the use of formula (13) with parameter c = 4 and a = 2 give 
us a uniform Cartesian grid with 65 x 33 x 33 = 70 785 discrete velocities. This is much 
too large to make the simulation (the memory storage itself is huge). Then we only do the 
simulation with the AMR grid (Pq version) generated by our algorithm which has only 2 956 
points. 

The gain factor in memory storage is 22. The steady state is reached in 3 424 iterations 
and 3 566s with 400 domains and 4 threads per domain. 

To estimate the gain in CPU that would be obtained if the uniform Cartesian grid could 
be used, we use the following remark: on the previous tests (2D plane and axisymmetric), 
the number of iterations to reach the steady state is almost the same for the uniform and 
the AMR velocity grids. Then we make a single iteration with the uniform Cartesian grid 
(same domain decomposition) and we multiply the CPU time obtained for this iteration by 
the number of iterations required with the AMR grid. We find a CPU time which is 24 times 
as large as with the AMR grid, which is quite important. This CPU gain is larger that the 
gain observed for the memory storage. 

In figure 13, we show the uniform Cartesian grid and the AMR grid. In figure 14, we 
show the pressure field computed by the code with the AMR grid, and the heat flux along 
the surface is shown in figure [15} 



4.5 Initialization with the CNS results 

Since we use a CNS pre-simulation to build our discrete velocity grid, it is interesting to use 
it also to initialize the BGK computation, what was done in the previous simulations. It 
helps to reach the steady state more rapidly, as it is shown in the following example. 



We take the same test case as in section 4.3, with the uniform velocity grid. In a first 



computation, the distribution is initialized with the (Maxwellian) upstream flow (this is the 
standard initialization), and the scheme converges to the steady state in 3 549 iterations. 
Then we do the same computation, but now the initial state is the Maxwellian distribution 
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computed with the macroscopic variables that are given by the steady CNS solution. Then 
the steady state is reached in 1 822 iterations only. 

A similar gain (half as many iterations as with the standard initialization) is obtained in 
all our test cases. 

5 Conclusion and perspectives 

In this paper, we have proposed a method to refine and coarsen a global velocity grid of a 
discrete velocity approximation of a kinetic equation. It is based on a criterion called "the 
support function" that links the local size of the velocity grid to the macroscopic temperature 
an velocity of the fiow. Our algorithm uses the macroscopic fields given by a compressible 
Navier-Stokes pre-simulation to automatically generates an optimal velocity grid. 

This approach has been tested in a 3D computational code (and 2D plane and axisym- 
metric versions) which uses the BGK model of rarefied gas dynamics. Typical test cases in 
hypersonic re-entry problems (for simplified geometries) have been used. We observed that 
using our algorithm allows important gains both in CPU time and memory storage, up to 
a factor 24. This allows to make simulations that are hardly possible with standard grids, 
even on super computers. 

An obvious, and straightforward, extension of this work will be the use of modified BGK 
models (like ES or Shakhov models), in order to have a model with a correct Prandtl number. 

Moreover, we will try to investigate non isotropic AMR velocity grids by taking into 
account translational kinetic temperatures in different directions, that can be obtained by 
the stress tensor computed by the CNS simulation. 

Our main short term project is to improve the physical accuracy of our code by imple- 
menting a simple BGK-like model for multi-species reactive flows. An application of our 
method to the full Boltzmann collision operator is less obvious but might also be explored. 

Acknowledgment. Experiments presented in this paper were carried out using the PLAFRIM 
experimental testbed, being developed under the Inria PlaFRIM development action with 
support from LABRI and 1MB and other entities: Conseil Regional d'Aquitaine, FeDER, 
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A Overview of the 3D kinetic code 

A.l The linearized implicit scheme 

Our code is an extension of the code presented in [23] to 3D polyatomic flows. It is based 
on the following reduced BGK model 

5t/ + v-V./ = -(M(f/)-/) 
r 

dtg + ^^ ■ V^g = -dRTM{U) - g), 

T Z 

where U = (mf + e'^^-'^') = {p, pu, E = |p|up + ^^pRT) is the vector of macroscopic mass, 
momentum, and energy density. Here, we use the standard notation (.) = Jj^g . d\ for any 
vector valued function of v, m{v) = (1, v, ||vp) is the vector of collisional invariants, and 
e(^) = (0, 0, 0, 0, 1). Moreover, M{U) is the standard Maxwellian distribution defined through 



the density, velocity, and temperature corresponding to the vector U above (see (|2j) 

This model comes from the reduction of a BGK model for the full distribution function 
F(t, X, V, /), where / is the internal energy variable, and 5 is the number of internal degrees 
of freedom (see [T7] for the first use of this technique and |3] for an application to polyatomic 
gases). Consequently, it accounts for any number of internal degrees of freedom. For instance, 
a diatomic gas can be described with 6 = 2. 

This model is first discretized with respect to the velocity variable. We follow the ap- 
proximation of [23] and its extension to polyatomic gases [16] . We assume we have a velocity 
grid {vg,g = : qmax] (like the grids described in this paper). The continuous distributions 
/ and g are then replaced by their approximations at each point Vg, and we get the following 
discrete velocity BGK system 



dtfq + Vq 


■ Vx/, = 


= -{M,{U) - 


-u: 


dtgq + Vg 


■ Vx^g = 


-\{Nq{U)- 


-9q) 



where {Mg{U),Ng{U)) is an approximation of {M{U),^RTM{U)) that has to be defined. 
As explained in this paper, we assume we have quadrature weights {ug} corresponding to 
our discrete velocity grid, so that the moment vector of the discrete distributions is 

Qmax 

[/=^(m(v,)/g + e(5)(7>g. 

q=0 

As proposed in [TU], the discrete equilibrium {Mg{U) , Ng{U)) is constructed so that it has 
the same moments as {f,g), that is to say 

Qmax 

J2 {m{wq)Mq{U) + e^'''^Ng{U))ujg = U. (14) 
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In our code, {Mq{U), Nq{U)) is determined through the entropic variable a such that Mq{U) = 
exp(a ■ m(vg)) and Nq{U) = f z^exp(a • m{\-q)), by solving (14) by a Newton algorithm. 



-"5 ■ 

We mention that the computational cost of this algorithm canoe significantly reduced by 
using a nice idea due to Titarev [25]. This optimization will be used in a future version of 
our code. 

The discrete velocity BGK system is then discretized by a finite volume scheme on a multi- 
block curvilinear 3D mesh of hexahedral cells ^ijk, with indices i,j, k = 1 to imax,jmax, k^ax, 
respectively. Denoting by /j";.^ an approximation of the average of / at time t„ on a cell flijk 
at the discrete velocity Vg, our scheme reads, in its implicit version. 



fn+l £r 

J ilka Ji 



Hjkq Jijkq / .^ /-n+lX _ ^ t n/T nTn+l\ /■n+l^ 



tjk 



A+ + r 9 ^^y<? )ijk~ ^n+iy^^gy^ijk ) yijkq)^ 

'ijk 



where 



q;^r = EM^^)/5J+^^'^^: 



(5)^n+l\ 



Mjkql^Q- 
q=0 

The discrete divergence (v^ ■ Vx.fq)^k ^^ given by the following second order upwind approx- 
imation (with the Yee limiter [2^]): 

(15) 
where 

+ |v,..,^.,^,,|minmod(A:^,A«;^,A;:;|) 

is the second order numerical fiux across the face between ^ij^k and ^7^+1^-^^, and z/j^i ^j, is 
the normal vector to this face directed from f^jj,fc to f2j+i j ^ while its norm is equal to the 
area of the face. In the minmod limiter function, we use the notation A""*"!^ = fj'tl .• i. „ — fT^l ■ 

*+| •' 'i + J- Ji'^iy •/ iJKq 

Finally, we use the standard notation a^ = (a ± |a|)/2 for every number a. The numerical 
fiuxes across the other faces are defined accordingly. For the sequel, it is useful to denote 
by (vg ■ Vx/^^^).., the corresponding hnear first order upwind discretization (that is 

to say, defined by (fTsl) where the minmod term is set to 0). 

Note that the boundary conditions are treated by the standard ghost cell technique 
(see [23]). 

The advantage of the time implicit approximation is that it ensures unconditional sta- 
bility, which allows us to take large time steps, and hence to get rapid convergence to the 
steady state. Of course, this scheme is too expensive, since it requires to solve a non linear 
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system at each time iteration. Therefore, we rather use the following linearization. First, 
the Maxwellian is linearized by a first order Taylor expansion: 



«j« 



and the same for NqiU^J. ) (see section A. 3 for explicit expressions of the Jacobian matrices). 



Then the discrete divergence, which is not differentiable due to the limiter, is linearized by 
using the corresponding first order upwind approximation. This is sometimes called a "frozen 
coefficient technique" , which gives 



iQHjk V? "xKJq I \Jq J Q ' ' I ijk 

Kjk '^ \^1 ' ^^^Jq ■'I'Ujk 



K-Vx/;),,, + K-v.(/r^ -/;)),"* -'^^^ 



Then, denoting by 5fij^k,q = fijtq ~ fljkq (same notation for g), and by SUij^k the moments 
of {Sfij^k, Sgi,j,k), our scheme reads in the following form: 

%^ + (v, ■ VJf,)l;l -'- - -^duM,{U-,){SU,,,,) = RHSf-^^, (16) 

%fi^ + (v, ■ VJ9,)];l -'- - -^duNqiU^^,)i5U,,,,) = RHSgl^^,, (17) 

where the right-hand sides are given by 

RHSf-^^, = - K ■ VJ:;) + -^(M,(f/5,) - f-,^ 



ijk 

1 



RHSgl^^, = - K ■ Vx<)^,, + -^{Nq{Ur^,) - g^, 



i]k ^' 



kq) 



ijk 



If our scheme converges to steady state, then the right-hand side is zero, and we get a 
second order discrete steady solution. 

A. 2 Linear solver 

At each time iteration, our linearized implicit scheme requires to solve the large linear sys- 



tem (16-17). It is therefore interesting to write it in the following matrix form: 



\At 



+ T + R'']5F = RHS'', 



matrix, T is a matrix such that 



where 6F = {6fij^k,q, Sgi,j,k,q) is a large vector that stores all the unknowns, I is the unit 
latrix such that 

(/ \ 1st order / \ 1st order\ 

V,-Vx<5/J A^q-VMq).. , (18) 
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i?" is the relaxation matrix such that 

and RHS" = [RHSf^^^^.^RHSglJ. 

For simphcity, we use exphcit boundary conditions, which means SFij^k,q = in ghost 
cells. This implies that T has a simple block structure, which is used in the following. 

The algorithm used in our code is based on a coupling between the iterative Jacobi and 
Gauss-Seidel methods. First, the relaxation matrix _R" is splitted into its diagonal A" and 
its off-diagonal —E^, so that we get the following system (this the Jacobi iteration): 

(-^+T + aA 5F = RHS" + E'^dF. (20) 

Note that E^ is very sparse: the product E"'6F is local in space and can be written 

. The left-hand side of this system has a three level tridi- 
ve it by using a line Gauss-Seidel iteration which is described 



A.3 



[ETj,k^Fij,k]q (see section 
agonal block structure. We so 
below. 

With curvilinear grids, in many cases, and in particular for re-entry problems, the flow 
is aligned with a mesh (generally aligned with solid boundaries), which means that the 
largest variations occur along the orthogonal direction, say the direction of index i, for 
instance. The idea is to use an "exact" inversion of T along this direction. This is done 
with a sweeping strategy sometimes called "Gauss-Seidel line iteration". First, we rewrite 



system ( |20| ) pointwise, as follows 
1 



— + A + B + C + A^^. ,^ 1 5F,,,- fc,, + A-5F,+i,,,k,g + A+5F,_i,,- fc,, 

+ B 6Fij + i^k,q + B 6Fij^i^k,q + C 6Fij^k+l,q + C SFij^k-l,q = RHS^ji^g + [E'^j^i^6Fij^k]q, 

where coefficients A,B,C,A^,B^,C^ are standard notations for line-Gauss-Seidel method 



and can be easily derived from (20), (18), and (15). Then the linear solver reads as shown 
in Algorithm [T] 

Remark A.l. 1. We observe that we get a fast convergence to steady state by using a 
few step of this linear solver (say P = 2 or 3). This is due to the "exact" inversion 
along the direction of largest variation of the flow. 

2. In practice, we add one back substitution step right after each forward substitution 
step. 

3. This solver is a straightforward extension of the linear solver used in [23] • More so- 
phisticated versions of this solver could be tested. 
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Algorithm 1 Jacobi/line-Gauss-Seidel algorithm 

(5_p(o) = % initialization 

for p from to P do % iterations of the solver 

for q from to Qmax do % loop over the discrete velocities (Jacobi loop) 

% one sweep in j direction (forward substitution) 
for k from 1 to k^ax do 
for j from 1 to jmax do 

solve exactly the tridiagonal system 



/ 1 

[At 


+ A + B + AI 


I 
,j,k,q 


"^i,j,k,q 


+ A- 


-.pipH) 






= RHS^jf,^^ + 


n. 


kSFl%], 


-B^ 


^.pip+h) 

^^i,j-l,k,q 


-B- 


'^'^i,j+l,k,g 


end for 
















end for 

















% one sweep in k direction (forward substitution) 
for j from 1 to jmax do 
for k from 1 to kmax do 

solve exactly the tridiagonal system 



(i^ + ^ + ^ + ^".m) SFJ^l + A-5i>tS. + ^-"^^-i 



(p+i) 

d,k,q 



RHS^jf^g + [E^j^^.6F.jJq - C^SF^j,^_^^g - C SF^^^^l^^ 



end for 
end for 
end for 



compute the moments of SF^^^^^ % for the action of A^ij^k,q '^'^'^ ^tiM ^'^^^ 5F(p+i) 



end for 

set 5F = 5F^P+^^ 
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A. 3 Jacobian matrices and the relaxation matrix i?" 

Elementary calculus gives the following formula: 

duM,{U) = M,{U)m{^,)A{U)-^ 

duN,{U) = iV,(f/)(mK) - ^e(^) )A{U)-\ 

where A{U) is the following 5x5 matrix 

Qmax / 1 \ 



Consequently, by using the definition (19) of the relaxation matrix i?", a careful algebra 
shows that its diagonal elements A" ■ ^ can be separated into the following two blocks 

(A"'4 ,A"'^, ): 



K:L = Z^ {M,[U:^,]m{y,)A{U-,)-'m{y,fuj, - l) 



T 



Therefore, the product of the off-diagonal part —E"^ of i?" with bF (as used in algorithm [I]) 
is simply 

- [i?i:,,,5F,,,,], = (--^9^M,([/^,)(<5f/,,,,) - a:;;/,.//,,,,,. 






^-duN,iU-,)i5U,,,,) - /^:l,Jg.,j,k,, 



T 
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Figure 1: Three distribution functions in different space points of a computational domain 
for a re-entry problem 
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/rt(^ 




u{x) - cy/RT{x) 



u{x) 



u{x) + c-s/Wix) 



Figure 2: Support of a distribution function /(x, .): the distribution (in black), its corre- 
sponding Maxwellian (in blue), and the corresponding support based on the bulk velocity u 
and the temperature T. 
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Illlllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 



Figure 3: Three different distribution functions and a global grid tliat satisfies ((7])-([8]): tfie 
bounds are given by tlie largest distribution, tlie step is given by tlie narrowest distribution. 
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uniforin Cartesian grid 



locally refined grid 



Figure 4: Three different distribution functions (top), the corresponding uniform Carte- 
sian velocity grid (middle), and the grid locally refined in the support of the distributions 
(coarsened elsewhere). 
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Figure 5: The macroscopic CNS flow (left), and some values of at three different v (right). 
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M^ff^^ 



fa 




Figure 6: CNS velocity and temperature fields (left), corresponding fine Cartesian velocity 
grid (right). 
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Figure 7: Left: support function, Right: velocity grids (solid line: induced AMR velocity 
grid, dotted line: initial fine Cartesian grid). 
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Figure 8: Comparison of the normal component of the heat flux to the sohd wall: fluxes 
obtained with the flne Cartesian grid and the AMR grid (top), relative difference in percent 
(bottom). 
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Figure 9: velocity and temperature fields for the axisymmetric fiow around a sphere 
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Figure 10: 2D axisymmetric flow around a sphere. Top: uniform cylindrical grid (dashed 
lines) and AMR grid (solid). Bottom: contours of the support function and cylindrical AMR 
grid. 
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Figure 11: 2D Axisymmetric flow around a sphere: comparison of tlie normal component of 
the heat flux to the surface: fluxes obtained with the fine uniform grid (dot) and the AMR 
grid (solid) 
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Figure 12: 2D Axisymmetric flow around a sphere: contours of tlie support function and 
cylindrical AMR grid based on the Rankine-Hugoniot relations. 
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Figure 13: 3D flow around a cone. Top: Cartesian velocity grid (dashed line) and AMR grid 
(solid). Bottom: AMR grid and support function based on the CNS simulation. 
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Figure 14: 3D flow around a cone: pressure field computed with the kinetic code and the 
AMR velocity grid. 
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Figure 15: 3D flow around a cone: normal heat flux along the boundary (computed with the 
kinetic code and the AMR velocity grid). 
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